Fix dispatcher crash on gevent LoopExit exceptions #2107
Handle gevent LoopExit exceptions gracefully so that they no longer crash the dispatcher. Add exception handling around the main loop and the lock_machines() call, with a loop-exit counter (max 10) to prevent infinite restarts. Start child processes with start_new_session=True so that job supervisors keep running independently if the dispatcher encounters an exception.

Signed-off-by: deepssin <deepssin@redhat.com>
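For readers who have not seen the diff, the approach described above might look roughly like the following. This is only a sketch under stated assumptions, not the code in this PR: `run_dispatcher`, `step`, and `spawn_supervisor` are hypothetical names, the counter is assumed never to reset, and a stand-in `LoopExit` class is defined when gevent is not installed.

```python
import subprocess

try:
    from gevent.exceptions import LoopExit
except ImportError:
    # Stand-in so the sketch runs without gevent installed (assumption).
    class LoopExit(Exception):
        pass

MAX_LOOP_EXITS = 10  # cap taken from the PR description


def run_dispatcher(step, max_loop_exits=MAX_LOOP_EXITS):
    """Run step() repeatedly, tolerating a bounded number of LoopExit errors.

    `step` is a hypothetical callable standing in for one pass of the
    dispatcher's main loop; it returns False to request a normal shutdown.
    Returns the number of LoopExit exceptions that were caught.
    """
    loop_exits = 0
    while True:
        try:
            if not step():
                return loop_exits
        except LoopExit:
            loop_exits += 1
            # Stop once the cap is reached instead of restarting forever.
            if loop_exits >= max_loop_exits:
                return loop_exits


def spawn_supervisor(args):
    """Start a child process in its own session (POSIX).

    With start_new_session=True the child calls setsid() before exec, so it
    is no longer in the parent's session or process group.
    """
    return subprocess.Popen(args, start_new_session=True)
```

Because the child is detached from the dispatcher's session, signals delivered to the dispatcher's process group do not reach it, which is what lets a job supervisor outlive a dispatcher failure.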
Do we have a reference ticket in the tracker for this issue?
zmc left a comment:
This looks like a really well-thought-out changeset! Since the dispatcher is difficult to fully validate with unit tests, can you briefly describe how you've tested this so far? Thanks!
@zmc For testing, I validated the changes in my OpenStack setup by:
Excellent - thanks for testing so thoroughly!
Deployed just now on |